Coxiella burnetii Genotyping

Multispacer sequence typing is the first reliable method for typing Coxiella burnetii isolates.

Coxiella burnetii is a strict intracellular bacterium with potential as a bioterrorism agent. To characterize different isolates of C. burnetii at the molecular level, we performed multispacer sequence typing (MST). MST is based on intergenic region sequencing. These regions are potentially variable since they are subject to lower selection pressure than the adjacent genes. We screened 68 spacers in 14 isolates and selected the 10 that exhibited the most variation. These spacers were then tested in 159 additional isolates that were obtained from different geographic areas or different hosts or that were implicated in different manifestations of human disease caused by C. burnetii. The sequence analysis yielded 30 different allelic combinations. Phylogenetic analysis showed 3 major clusters. MST allows easy comparison and exchange of results obtained in different laboratories and could be a useful tool for identifying bacterial strains.
Coxiella burnetii is a strict intracellular microorganism, included in the γ subdivision of the Proteobacteria phylum (1). It is found in close association with arthropod and vertebrate hosts, and it causes Q fever in humans and animals. Cattle, goats, and sheep are the primary reservoirs of human infection. In humans, the disease may appear in 2 forms, acute and chronic (2). Acute Q fever may be asymptomatic or appear as atypical pneumonia, granulomatous hepatitis, or self-limited febrile illness. In some persons, the immune system is unable to control the infection and chronic Q fever occurs. The manifestations of chronic Q fever are endocarditis, hepatitis, osteomyelitis, or infected aortic aneurysms. C. burnetii is highly infectious by the aerosol route and can survive for long periods in the environment.
Several other methods have been used to type different isolates of the same species, in particular, multilocus enzyme electrophoresis (15) and multilocus sequence typing (MLST) (16). Many bacterial species have been studied by using these approaches (17)(18)(19).
Recently, the whole genome of the C. burnetii Nine Mile strain was sequenced (14). We decided to investigate parts of the genome located between 2 open reading frames (ORFs) because they are considered potentially variable since they are subject to lower selection pressure than the adjacent genes. The 16S/23S ribosomal spacer region has been widely used to genotype bacteria (20)(21)(22)(23). We investigated the utility of multispacer sequence typing (MST) with 173 C. burnetii isolates. After screening, we selected 10 variable spacers and showed that the combination of the different sequences allowed us to characterize 30 different genotypes. Phylogenetic analysis inferred from compiled sequences characterized 3 monophyletic groups, which could be subdivided into different clusters.

Multispacer Sequence Typing
The whole genome of C. burnetii was accessible on the NCBI server (GenBank NC_002971). We kept spacers that were 300-700 bp in length. Primers were chosen in neighboring genes to allow polymerase chain reaction (PCR) amplification at 57°C and are listed in Table 1. Each PCR was carried out in a T3 Thermocycler Biometra (Biolabo, Archamps, France). Two microliters of the DNA preparation was amplified in a 50-µL reaction mixture containing 200 µmol/L of each primer, 200 µmol/L (each) dATP, dCTP, dGTP, and dTTP (Invitrogen), and 1.5 U Taq DNA polymerase (Roche, Meylan, France) in 1× Taq buffer. Amplifications were carried out under the following conditions: initial denaturation of 10 min at 95°C, followed by 37 cycles of denaturation for 30 s at 95°C, annealing for 30 s at 57°C, and extension for 1 min at 72°C. PCR products were purified and sequenced as previously described (25).
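The spacer selection step described above (intergenic regions of 300-700 bp flanked by 2 genes) can be illustrated with a minimal sketch. It is not the procedure used in this study; the function name candidate_spacers, the coordinate format, and the gene names and positions are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# (gene name, start, end) in 1-based inclusive coordinates; hypothetical format
Gene = Tuple[str, int, int]
Spacer = Tuple[str, str, int, int, int]


def candidate_spacers(genes: List[Gene], min_len: int = 300,
                      max_len: int = 700) -> List[Spacer]:
    """Return intergenic regions whose length falls within [min_len, max_len].

    Each result is (left gene, right gene, spacer start, spacer end, length).
    Overlapping genes yield a negative gap and are skipped by the length filter.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    spacers = []
    for (left, _, left_end), (right, right_start, _) in zip(ordered, ordered[1:]):
        length = right_start - left_end - 1
        if min_len <= length <= max_len:
            spacers.append((left, right, left_end + 1, right_start - 1, length))
    return spacers


# Illustrative coordinates only; these are not C. burnetii annotations.
genes = [("geneA", 1, 900), ("geneB", 1301, 2500),
         ("geneC", 2551, 3600), ("geneD", 4501, 5200)]
spacers = candidate_spacers(genes)
print(spacers)
```

In this toy input only the geneA/geneB gap (400 bp) falls within the window; the 50-bp and 900-bp gaps are discarded. Primers would then be placed in the 2 flanking genes, as described in the text.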
PCR products were cloned in pGEM-T Easy Vector (Promega, Charbonnières, France) according to the manufacturer's instructions. Ten clones were cultivated in LB medium (USB, Cleveland, OH, USA) overnight, and PCR and sequencing were performed as described above.

Plasmid Sequence Type, com1 Type, and djlA Type Determination
PCR for QpH1 and QpRS plasmid sequences was performed with the previously described primers QpH11/12 and QpRS01/02 (5). PCR was carried out as described for MST, except that the annealing temperature was 55°C and the cycle number was 35. PCR primers for QpDV and QpRS sequence plasmid amplification were chosen after comparison of the entire sequence of the 2 plasmids. The primers were QpDV1f and QpDV1r. PCR amplification was carried out at 63°C for 30 cycles. PCR was performed as previously described for com1 and djlA (13) (Appendix Table 2, available at http://www.cdc.gov/ncidod/EID/vol11no08/04-1354_app.htm#table2).

Data Analysis
Statistical analyses were performed by using the chi-square test in the program EpiInfo 6 (26). The spacer sequences were compiled and aligned by using the multisequence alignment program ClustalX (1.8). The phylogenetic relationships between the C. burnetii isolates were determined by using Mega version 2.0 (27). A matrix of pairwise differences in allele profiles was constructed, and the similarities between the allelic profiles of the isolates were assessed by cluster analysis using the unweighted pair-group method with arithmetic mean (UPGMA). Another analysis of the results was performed by using the BURST algorithm (http://www.mlst.net), which defines clonal complexes in which every isolate shares at least 5 identical alleles with at least 1 other isolate (Cox2, Cox5, Cox18, Cox20, Cox37, Cox56, and Cox57 were kept for the analysis) and characterizes ancestral genotypes. The C. burnetii MST database is available at the following website: http://ifr48.timone.univ-mrs.fr, and ST determination by sequence comparison is possible at this site.
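The construction of a matrix of pairwise allelic differences followed by UPGMA clustering can be sketched as follows. This is a minimal pure-Python illustration, not the Mega 2.0 workflow used in this study; the function names and the 7-locus allelic profiles are hypothetical, and ties between equally close pairs are broken alphabetically only to make the output reproducible.

```python
from itertools import combinations
from typing import Dict, Tuple


def allelic_distance(p: Tuple[int, ...], q: Tuple[int, ...]) -> int:
    """Number of loci at which two allelic profiles differ."""
    return sum(1 for a, b in zip(p, q) if a != b)


def upgma(profiles: Dict[str, Tuple[int, ...]]) -> str:
    """Cluster profiles by UPGMA and return the tree topology as a nested string."""
    sizes = {name: 1 for name in profiles}
    # Pairwise distance matrix, stored as {frozenset({a, b}): distance}
    dist = {frozenset(pair): float(allelic_distance(profiles[pair[0]], profiles[pair[1]]))
            for pair in combinations(sorted(profiles), 2)}
    while len(sizes) > 1:
        # Closest pair of clusters (alphabetical tie-break)
        a, b = min((tuple(sorted(k)) for k in dist),
                   key=lambda ab: (dist[frozenset(ab)], ab))
        merged = f"({a},{b})"
        # Distance from the merged cluster to every other cluster is the
        # size-weighted mean of the two former distances (arithmetic mean
        # over all member pairs).
        for c in sizes:
            if c in (a, b):
                continue
            dist[frozenset((merged, c))] = (
                dist[frozenset((a, c))] * sizes[a] + dist[frozenset((b, c))] * sizes[b]
            ) / (sizes[a] + sizes[b])
        for key in [k for k in dist if a in k or b in k]:
            del dist[key]
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))


# Hypothetical profiles; the allele numbers are not taken from Table 2.
profiles = {
    "ST_A": (1, 1, 1, 1, 1, 1, 1),
    "ST_B": (1, 1, 1, 1, 1, 1, 2),
    "ST_C": (3, 3, 3, 1, 1, 1, 1),
    "ST_D": (3, 3, 3, 1, 1, 2, 1),
}
tree = upgma(profiles)
print(tree)
```

With these toy profiles, ST_A and ST_B differ at 1 locus, as do ST_C and ST_D, so each pair is joined first and the 2 pairs are joined last.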

Choice of Spacers for Typing and Analysis by MST
Initially, 14 isolates were chosen to test the genetic diversity of the spacers: Nine Mile, Priscilla, Q212, Heizberg, Brasov, Dog ut Ad, CB15, CB20, CB26, CB28, CB33, CB35, CB114, and CB115. We chose 68 spacers, but we retained only the 51 spacers for which PCR amplification was obtained for all the isolates. We kept 10 spacers (Cox2, Cox5, Cox18, Cox20, Cox22, Cox37, Cox51, Cox56, Cox57, and Cox61) (Table 1) because they were representative of the results found when we analyzed the entire test set of 51 spacers. For each spacer, the number of variable sites in the sequences was determined, and the percentage of variability was calculated. The percentages were, respectively, 1.1, 1.4, 1.9, 0.7, 2.3, 1.2, 1.4, 2.5, 1.7, and 2.1. We kept Cox18, Cox22, Cox51, Cox56, Cox57, and Cox61 because the percentage of variability in these spacers was high compared with the other spacers. We kept Cox2, Cox5, Cox20, and Cox37 because they allowed the characterization of CB35, CB15, CB26 and CB28, and Nine Mile, respectively. To test the reliability of the spacers we kept, the chi-square value was determined by using 1% as the threshold value. The Fisher value was found to be statistically significant (9 × 10⁻⁴). We then added 159 other isolates. Sequences were obtained for all the isolates with spacers Cox2, Cox18, Cox20, Cox22, Cox37, Cox51, and Cox57. Mixed sequences were obtained with the isolate Poker Cat with spacers Cox5, Cox56, and Cox61. We cloned the PCR products and showed that several sequences were present after PCR amplification, including insertions or deletions. The allele distribution of the different spacers is described in Table 2. Each of the different sequences in a locus defined a distinct genotype, even if it differed from the others by only a single nucleotide. Thirty different sequence types (STs) were identified by using MST.
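Two calculations in this section lend themselves to a short sketch: the percentage of variable sites in an aligned spacer, and the assignment of an ST to each distinct combination of alleles. The code below is illustrative only and assumes that sequences are already aligned, that a gap character counts as a difference, and that STs are numbered in order of first appearance. The sequences, isolate names, and resulting ST numbers are hypothetical and do not correspond to the numbering in Table 2.

```python
from typing import Dict, List, Tuple


def percent_variability(aligned: List[str]) -> float:
    """Percentage of alignment columns in which the sequences are not all identical."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    variable_sites = sum(1 for column in zip(*aligned) if len(set(column)) > 1)
    return 100.0 * variable_sites / length


def assign_sequence_types(profiles: Dict[str, Tuple[int, ...]]) -> Dict[str, str]:
    """Give every distinct allele combination its own ST label.

    Two isolates receive the same ST only if they carry identical alleles at
    every locus; a single-nucleotide difference in one spacer defines a new
    allele and therefore a new ST.
    """
    st_numbers: Dict[Tuple[int, ...], int] = {}
    assigned = {}
    for isolate, profile in profiles.items():
        number = st_numbers.setdefault(profile, len(st_numbers) + 1)
        assigned[isolate] = f"ST{number}"
    return assigned


# Toy alignment of one spacer in 4 isolates (10 columns, 2 of them variable)
aligned = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC", "ACGTACGTAT"]
variability = percent_variability(aligned)

# Toy allelic profiles over 3 spacers
isolate_profiles = {"iso1": (1, 1, 2), "iso2": (1, 1, 2), "iso3": (1, 2, 2)}
sts = assign_sequence_types(isolate_profiles)
print(variability, sts)
```

Real spacers are several hundred base pairs long, which is why the percentages reported above are on the order of 1% to 2.5%.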

Computer Analysis of MST Data
The dendrogram in the Figure was constructed from a matrix of pairwise allelic differences between the compiled sequences of the 30 STs. We identified 3 monophyletic groups within the tree. The first group, representing 13 different STs, included isolates from France, Spain, Russia, Kyrgyzstan, Namibia, Kazakhstan, Ukraine, Uzbekistan, and the United States. It was divided into 2 subgroups. The first subgroup included 36 isolates representing 8 different STs (ST1 to ST7 and ST30); 19 of these isolates were ST1. The second subgroup included 39 isolates representing 5 different STs (ST8, ST9, ST10, ST26, and ST28); 28 of these isolates were ST8.
The third group consisted of only 1 ST, ST21, and included the 7 Canadian isolates, 2 isolates from France (CB4 and CB7), and 1 isolate from the United States (Scurry). The clusters determined by the BURST algorithm were consistent with those determined by the phylogenetic analysis. Five groups were defined. The first one included ST1 to ST7; the putative ancestral genotype in this group was ST1. ST8 (putative ancestral genotype), ST9, ST10, ST26, and ST28 were included in the second group; ST11, ST12 (putative ancestral genotype), ST13, ST14, ST15, and ST24 in the third group, ST16 and ST17 in the fourth group; and ST18 (putative ancestral genotype), ST22, ST23, ST25, and ST29 in the fifth group. ST19, ST20, ST21, and ST30 were considered as singletons.
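The grouping rule applied in the BURST analysis (every member of a group shares at least 5 of the 7 alleles with at least 1 other member) can be sketched as a connected-components search. This is a simplified illustration rather than the BURST implementation itself. The choice of the putative ancestral genotype as the member with the most single-locus variants is likewise a simplification, and the profiles are hypothetical.

```python
from typing import Dict, List, Tuple

Profile = Tuple[int, ...]


def shared_alleles(p: Profile, q: Profile) -> int:
    """Number of loci at which two profiles carry the same allele."""
    return sum(1 for a, b in zip(p, q) if a == b)


def clonal_groups(profiles: Dict[str, Profile], min_shared: int = 5) -> List[List[str]]:
    """Link STs sharing >= min_shared alleles and return the connected groups."""
    names = sorted(profiles)
    parent = {name: name for name in names}

    def find(name: str) -> str:
        # Union-find root lookup with path halving
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if shared_alleles(profiles[a], profiles[b]) >= min_shared:
                parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


def putative_founder(group: List[str], profiles: Dict[str, Profile]) -> str:
    """Member with the most single-locus variants in its group (first one on ties)."""
    def slv_count(name: str) -> int:
        n_loci = len(profiles[name])
        return sum(1 for other in group if other != name
                   and n_loci - shared_alleles(profiles[name], profiles[other]) == 1)
    return max(group, key=slv_count)


# Hypothetical 7-locus profiles; not the alleles reported in Table 2.
profiles = {
    "ST_a": (1, 1, 1, 1, 1, 1, 1),
    "ST_b": (1, 1, 1, 1, 1, 1, 2),
    "ST_c": (1, 1, 1, 1, 1, 2, 2),
    "ST_d": (5, 5, 5, 5, 5, 5, 5),
    "ST_e": (5, 5, 5, 5, 5, 5, 6),
    "ST_f": (9, 9, 9, 9, 9, 9, 9),
}
groups = clonal_groups(profiles)
founder = putative_founder(groups[0], profiles)
print(groups, founder)
```

In this toy data set, ST_f shares no allele with any other profile and is reported as a singleton, in the same way that ST19, ST20, ST21, and ST30 were in the analysis above.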

Discussion
Q fever in humans and animals, caused by C. burnetii, is found worldwide. In humans, it causes a variety of diseases such as acute flulike illness, pneumonia, hepatitis, and chronic endocarditis. In animals, C. burnetii is found in the reproductive system, in both the uterus and the mammary glands, and may cause abortion or infertility.
Molecular methods are now almost universally used to characterize strains and to determine the relatedness between isolates causing diseases in different contexts. The most discriminative approach used for C. burnetii isolates until this study was PFGE. Twenty different restriction patterns were distinguished after NotI restriction of total C. burnetii DNA and PFGE (11). Comparison of PFGE profiles is sometimes difficult because good separation of the different fragments is required. For example, the isolate Heizberg was classified in group 1 by Thiele et al. (10) and in group 2 by Jäger et al. (11). This fact highlights the difficulty of comparing results obtained by this technique. Moreover, in some species, rapid genomic rearrangements occur because of repeats or insertion sequences, so even if isolates descended from a common ancestor that arose several decades ago, they may not readily be seen to be minor variants of the same clone. In these cases, PFGE does not contribute to tracing of isolates. The great advantage of MST over PFGE as a typing method is the lack of ambiguity and the portability of sequence data, which allow results from different laboratories to be compared without exchanging strains. This work is the first to include so many isolates in a rigorous examination of molecular epidemiology. The study of this bank of sequences will contribute to understanding the propagation mode of the bacteria as variations accumulate relatively slowly, thus making it an ideal tool for global epidemiology. For example, in ST16 we characterized isolates that were obtained from 1935 (Nine Mile) to 1991 (CB25).
Most of the French isolates were included in monophyletic group 1. Nineteen were included in ST1, and 24 were included in ST8. Thus, an isolate has a geographic distribution even if genetic modifications (insertions, deletions, or mutations) appear over time, giving rise to a new ST that is related to the ancestor isolate. This fact was highlighted when the analysis of the STs was performed by using the BURST algorithm. ST1 and ST8 were described as the ancestral genotypes; for example, ST9 and ST10 corresponded to single-locus variants (SLVs) of ST8 (isolates that differ at only 1 of the 7 loci), and ST26 and ST28 corresponded to double-locus variants (DLVs) of ST8. However, some types were not delineated on the basis of geographic origin because they were isolated from different parts of the world. This distribution in distant countries is likely related to movements of infected patients, animals, or ticks. This is particularly true for ST16 isolates, which were encountered on 4 different continents: America, Europe, Asia, and Africa. The homology of the Canadian isolates from Nova Scotia should be noted. Q fever is just as endemic in Nova Scotia as in France. This may indicate rapid and recent spreading of a single strain. The association between ST21 and Canada is significant as tested with the chi-square test, with a Fisher value <10⁻⁸. Notably, patient CB115, who had Q fever endocarditis, was living in Edmonton, Alberta (≈3,000 miles from Nova Scotia) when this illness was diagnosed. He grew up in Nova Scotia, and the molecular epidemiologic findings show that he acquired his disease there. Q fever is uncommon in Alberta. Most of the STs are found in Europe. A sample bias could exist as most of the isolates tested were from this continent, but the results obtained may also indicate that C. burnetii originated in the Old World and spread later to the New World, excluding New Zealand.
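The association between ST21 and Canadian origin can be illustrated with a one-sided Fisher exact test on a 2 × 2 table. The sketch below is not the EpiInfo calculation used in this study. The counts rest on an assumption: that the 7 Canadian isolates assigned to ST21 are the only Canadian isolates among the 173 tested, and that ST21 contains 3 non-Canadian isolates, as the Results section suggests. If the actual table differs, the p value will differ.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact p value for the table [[a, b], [c, d]].

    Returns the hypergeometric probability of observing a or more counts in
    the top-left cell, given fixed row and column totals.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    total = comb(n, col1)
    favorable = sum(comb(row1, k) * comb(n - row1, col1 - k)
                    for k in range(a, upper + 1))
    return favorable / total


# ASSUMED counts (see lead-in): rows = ST21 / other STs, columns = Canada / elsewhere
st21_canada, st21_elsewhere = 7, 3
other_canada, other_elsewhere = 0, 163

p_value = fisher_one_sided(st21_canada, st21_elsewhere, other_canada, other_elsewhere)
print(f"{p_value:.2e}")
```

Under these assumed counts, the p value falls below the 10⁻⁸ threshold reported in the text.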
Concordant results were found when MST was compared with com1 and djlA sequence comparison (Figure). However, MST was more discriminatory. Plasmid profile investigation of C. burnetii detected 4 different plasmids (QpH1, QpRS, QpDV, and QpDG) and 1 group of plasmidless isolates. QpH1 was first found in the Nine Mile tick isolate (28). QpRS was first found in the goat isolate Priscilla (29). QpDG was described from isolates obtained from feral rodents near Dugway, Utah (8). QpDV was found in French and Russian isolates (5,6). Another plasmid type, not well characterized, was described in China (30). The existence of a plasmidless C. burnetii isolate, Scurry Q217, was described (31), but a chromosomally integrated plasmid-homologous DNA fragment was found in this isolate by hybridization (32,33). Plasmid type sequence detection was also correlated with MST. Group 2 included isolates that were positive by PCR amplification with primers specific for QpH1. Group 3 included 3 isolates, 2 from France (CB4 and CB7) and 1 from Nova Scotia (Poker Cat), in which the QpH1 plasmid sequence type was detected. No such sequence was detected in the other isolates of Nova Scotia origin included in group 3. Group 1 included isolates that were positive by PCR amplification with primers specific for QpRS (47/77). QpDV plasmid was described in isolates from France, Spain, Ukraine, and Kyrgyzstan. In fact, regions shared by QpH1, QpRS, and QpDV were termed "core plasmid sequences" and encompassed 25 kb. QpH1, QpRS, and QpDV are, respectively, 37 kb, 39 kb, and 33 kb in size. Integrated sequences in the American isolate represent 18 kb. Differences in plasmid size and sequence can be explained by notable sequence rearrangements, such as deletions, insertions, or duplications, because several repeat sequences have been identified through which such rearrangements might have occurred.
For CB13, we were able to characterize sequences for plasmids QpH1 and QpDV, which could be explained by several situations: this isolate may have 1) 2 different plasmids, 2) a QpH1 plasmid and sequences of QpDV integrated in the chromosome, or 3) a new plasmid that arose from a combination of QpH1 and QpDV. All these hypotheses are in agreement with the presence of QpH1 plasmid in the ancestor of C. burnetii isolates. This plasmid was lost by some of them (monophyletic group 3), but genetic information of crucial importance for the organism was integrated in the chromosome. For other isolates, QpH1 plasmid evolved to QpRS plasmid; in some isolates, QpRS plasmid evolved to QpDV plasmid.
This study showed a correlation between QpDV and acute infections, between QpRS and chronic infections, and an association between some genotypes and disease type. A bias in sampling exists since acute disease is 20 times more frequent than chronic disease, but in this study, most of the human isolates were from chronic disease patients, and the isolates from acute infections were mainly obtained from France. These facts reflect the difficulty in isolating the bacteria. A genomic typing method such as MST could be applied directly to samples to obtain a more precise idea of how C. burnetii is spreading in the environment and the pathogenetic implications in acute and chronic forms of Q fever.
Comparison of DNA sequences is the best approach to investigate bacterial evolution. MLST in association with BURST analysis has been used to type isolates of many species. However, this method is useful only if housekeeping gene diversity exists in the studied species. For example, in the species Yersinia pestis, no diversity was found in the housekeeping genes studied (34). With the MST approach, differentiation of the 3 biovars Antiqua, Medievalis, and Orientalis was possible (25), which shows that the discriminatory power of MST is higher than that of MLST and is comparable to that of tandem repeats analysis (35). Low variability was found in C. burnetii housekeeping genes such as 16S rRNA (36) and rpoB (37). MST is the first method that allows rapid and reliable typing of C. burnetii isolates during investigations of outbreaks by sequencing the PCR products obtained from the 10 spacers described. We did not test isolates from Australia, and we tested only 8 from the United States. Two isolates from Africa (Namibia and CB119) were considered as singletons in the BURST analysis, denoting a lack of closely related isolates. In the future, isolates that were not available in our laboratory during this study must be tested so that the missing links in our phylogenetic analysis can be determined. The constitution of a database on a website will allow isolates from all the countries in the world to be compared and will increase understanding of the propagation of C. burnetii isolates.